Abstract. In this article we consider the inversion problem for polynomially computable discrete functions.
These functions describe the behavior of many discrete systems and are used in model checking, hardware
verification, cryptanalysis, computational biology and other domains. Quite often it is necessary to invert these
functions, i.e. to find an unknown preimage given an image and an algorithm computing the function.
In the general case this problem is computationally intractable. However, many of its special cases are very
important in practical applications. Thus the development of algorithms applicable to these special cases
is of importance. The practical applicability of such algorithms can be validated by their ability to solve
problems that are considered computationally hard (for example, cryptanalysis problems). In this article
we propose a technology for solving the inversion problem for polynomially computable discrete functions.
This technology was implemented in distributed computing environments (parallel clusters and Grid systems).
It is based on reducing the inversion problem for the considered function to a SAT problem. We describe a
general approach to coarse-grained parallelization of the obtained SAT problems. The efficiency of each parallelization
scheme is determined by means of a special predictive function. The proposed technology was validated by
successfully solving cryptanalysis problems for some keystream generators. The main practical result of this
work is a complete cryptanalysis of the keystream generator A5/1, which was performed in a Grid system specially
built for this task.
1 Introduction

Let $\{0,1\}^n$, $n \in \mathbb{N}_1$, be the set of all possible binary
sequences of length $n$. We also use the notation
$\{0,1\}^* = \bigcup_{n \in \mathbb{N}_1} \{0,1\}^n$. This work focuses on
families of discrete functions of the form

$$f = \{f_n\}_{n \in \mathbb{N}_1}, \qquad f_n : \{0,1\}^n \to \{0,1\}^*.$$
We consider the class formed by the families of discrete functions for
which the following conditions are satisfied:

1) for every $n \in \mathbb{N}_1$ the function $f_n$ is defined everywhere
on $\{0,1\}^n$ (we denote this fact as $\mathrm{dom}\, f_n = \{0,1\}^n$);

2) there exists a program $M_f$ for a deterministic Turing machine which
computes an arbitrary function of the family $f$;

3) the time complexity of the program $M_f$ grows with $n$ as a
polynomial in $n$.

Hereafter we write $f_n \in f$ to indicate that the discrete function
$f_n$ belongs to a family with the properties 1-3. For a discrete function
$f_n \in f$ given in the form of a pair $(M_f, n)$ and a word
$y \in \mathrm{range}\, f_n$, the problem we are interested in is to find
$x \in \mathrm{dom}\, f_n$ such that $y = f_n(x)$. We call this problem
the inversion problem for the function $f_n$ at the point $y$. The general
inversion problem for a family $f$ with properties 1-3 we denote by
$\mathrm{Inv}(f)$.
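To make the statement of the inversion problem concrete, the following minimal sketch inverts a toy function by exhaustive search. The function `toy_f` and the helper `invert_by_search` are hypothetical illustrations, not taken from the article; the search costs $2^n$ evaluations, which is exactly what becomes infeasible for the functions of interest.

```python
from itertools import product

def toy_f(x):
    """Hypothetical f_n: XOR of adjacent input bits (n bits in, n - 1 bits out)."""
    return tuple(x[i] ^ x[i + 1] for i in range(len(x) - 1))

def invert_by_search(f, n, y):
    """Return all x in {0,1}^n with f(x) == y by trying all 2^n inputs."""
    return [x for x in product((0, 1), repeat=n) if f(x) == y]

# Inversion of toy_f at the point y = (1, 0, 1) for n = 4.
preimages = invert_by_search(toy_f, 4, (1, 0, 1))
```

Here the point $y$ has two preimages, so the inversion problem in general asks for any one of them.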

In the present article we describe an approach based on the possibility
of effectively reducing the problem $\mathrm{Inv}(f)$ to SAT.

In the process of reducing the problem $\mathrm{Inv}(f)$ to a SAT problem
it is possible to single out, from the set of variables of the obtained
CNF, the subset corresponding to the "input variables" of the considered
function. This is fundamental for constructing decompositions of SAT
problems into SAT problems of lower dimension, which are subsequently
solved in distributed computing environments. We show that the use of this
simple principle gives good results for the inversion problems of some
discrete functions used in cryptography.

Let us give a brief outline of the article. 
In the second section we give basic notions 
of the theory of discrete functions and briefly 
describe a technology for reducing inversion 
problems of the functions computable in poly- 
nomial time to SAT problems, focusing on the 
functions used in cryptography. 

In the third section we describe the technology of coarse-grained
parallelism that we use for solving SAT problems. Using this technology we
decompose a given SAT problem into a family of SAT problems of lower
dimension. Such decompositions can be performed in different ways. We are
interested in selecting a decomposition that is good in terms of overall
computing time. This is done by solving an optimization problem for a
special predictive function.

In the fourth section we describe some modifications intended to improve
the efficiency of a basic SAT solver for the problem $\mathrm{Inv}(f)$.
Here we also give the results of a successful cryptanalysis of some
keystream generators performed on a low-performance computing cluster.

The fifth section is entirely devoted to the 
use of the technology presented in this paper 
for solving the problem of cryptanalysis of 
the keystream generator A5/1 using a Grid 
system specially constructed for this purpose. 



2 Reducing the discrete functions inversion problems to SAT problems

2.1 Basic notions

Hereafter $X = \{x_1, \ldots, x_n\}$ denotes the set of Boolean variables.
Let $L(x_1, \ldots, x_n)$ be an arbitrary propositional formula. We denote
by $L(\alpha_1, \ldots, \alpha_n) = \beta$ the fact that the result of
substituting $\alpha_i \in \{0,1\}$, $i \in \{1, \ldots, n\}$, into the
formula $L(x_1, \ldots, x_n)$ is $\beta \in \{0,1\}$.

The expressions of the form $L(x_1, \ldots, x_n) = 0$ and
$L(x_1, \ldots, x_n) = 1$ are called Boolean equations (see [1]). For a
fixed $\beta \in \{0,1\}$ a solution of the equation
$L(x_1, \ldots, x_n) = \beta$ is a vector
$(\alpha_1, \ldots, \alpha_n) \in \{0,1\}^n$ such that
$L(\alpha_1, \ldots, \alpha_n) = \beta$. If such a vector does not exist,
we say that the Boolean equation has no solutions.

The terms $x_i, \bar{x}_i$, $i \in \{1, \ldots, n\}$, are called literals
over $X$. The literals $x$ and $\bar{x}$ are called complementary
literals. A clause over $X$ is an arbitrary disjunction of literals over
$X$ which contains neither repeated nor complementary literals. A
conjunctive normal form (CNF) over $X$ is an arbitrary conjunction of
different clauses over $X$.

Let $C(x_1, \ldots, x_n)$ (shortly $C$) be an arbitrary CNF over the set
of Boolean variables $X = \{x_1, \ldots, x_n\}$. The vector
$(\alpha_1, \ldots, \alpha_n) \in \{0,1\}^n$ is called a satisfying
assignment of $C$ if $C(\alpha_1, \ldots, \alpha_n) = 1$. A CNF for which
there exists a satisfying assignment is called satisfiable; otherwise it
is called unsatisfiable. The problem of deciding the satisfiability of an
arbitrary CNF, as well as the problem of finding a satisfying assignment
of an arbitrary satisfiable CNF, are the problems we consider below.
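The notions of clause, CNF and satisfying assignment can be illustrated with a short sketch. The integer encoding of literals (`k` for $x_k$, `-k` for $\bar{x}_k$) is an assumption borrowed from common SAT-solver practice, and the CNF below is a hypothetical example, not one from the article.

```python
from itertools import product

def evaluate_cnf(cnf, assignment):
    """Evaluate a CNF (list of clauses, each a list of nonzero ints)
    under an assignment given as a dict {variable: 0 or 1}."""
    return all(
        any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)
        for clause in cnf
    )

def satisfying_assignments(cnf, n):
    """Enumerate all satisfying assignments over variables 1..n (brute force)."""
    found = []
    for bits in product((0, 1), repeat=n):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if evaluate_cnf(cnf, assignment):
            found.append(bits)
    return found

# Hypothetical CNF: (x1 v x2) & (~x1 v x3) & (~x2 v ~x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
models = satisfying_assignments(cnf, 3)
```

A CNF is satisfiable exactly when this list is nonempty; real SAT solvers avoid the exhaustive enumeration used here.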

S.A. Cook showed in [2], that the process 
of executing a program M, which stops on an 
arbitrary input, on a Turing machine with the 
input alphabet S = {0,1} can be represented 
by a system of Boolean equations. 

Let $f$ be a family of discrete functions from the class defined above. An
arbitrary function $f_n \in f$ given by a pair $(M_f, n)$ will be
considered as a function of Boolean variables from the set
$X = \{x_1, \ldots, x_n\}$. We call $X$ the set of input variables of the
function $f_n$. According to [2] there exists an algorithm with time
complexity bounded by a polynomial in $n$ which, given a pair $(M_f, n)$,
transforms the problem of inversion of $f_n$ at an arbitrary point
$y \in \mathrm{range}\, f_n$ into the problem of finding solutions of an
equation of the type

$$C(x_1, \ldots, x_{q(n)}) = 1. \tag{1}$$

Here $q(\cdot)$ is some polynomial, and $C(x_1, \ldots, x_{q(n)})$ is a
satisfiable CNF over the set of Boolean variables
$\{x_1, \ldots, x_{q(n)}\}$. Further we write "CNF encoding discrete
function inversion problem" meaning the CNF $C(x_1, \ldots, x_{q(n)})$ and
"equation encoding discrete function inversion problem" meaning the
equation $C(x_1, \ldots, x_{q(n)}) = 1$.

It should be particularly noted that the procedure for reducing the
inversion problem to the search for solutions of Boolean equations must be
parsimonious (see [3], [4]). That is, the number of solutions of (1) must
coincide with the number of preimages of $y \in \mathrm{range}\, f_n$, and
there must exist an effective procedure for passing from an arbitrary
solution of (1) to the corresponding preimage. This is essential for an
effective inversion of discrete functions in practice.

Not all procedures of propositional encoding are parsimonious. However, it
is not difficult to show that the well-known Tseitin transformations have
this property. These transformations were proposed by G.S. Tseitin in 1968
in [5] (reprinted in [6]). Next, we describe the use of Tseitin
transformations for the parsimonious reduction of Boolean equations to
normal forms.

These transformations were described (explicitly or implicitly) in a
number of sources (e.g., [7], [8]) where an original function is usually
represented by a Boolean circuit $S(f_n)$ over an arbitrary complete
basis, for example $\{\&, \neg\}$. Each variable from $X$ corresponds to
one of the $n$ inputs of $S(f_n)$. For each logic gate $G$ a new auxiliary
variable $v(G)$ is introduced. We denote the set of all auxiliary
variables by $V$. Every AND-gate $G$ is encoded by a CNF representation of
the Boolean function $v(G) \leftrightarrow u \,\&\, w$. Every NOT-gate $G$
is encoded by a CNF representation of the Boolean function
$v(G) \leftrightarrow \neg u$. Here $u$ and $w$ are variables
corresponding to the inputs of $G$. The CNF encoding $S(f_n)$ is formed
from the CNFs $C(G)$ of the gates of the circuit together with the output
variables, where $C(G)$ is the CNF encoding the gate $G$ and
$y_1, \ldots, y_m$ are the variables corresponding to the outputs of
$S(f_n)$.
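The gate encodings mentioned above can be sketched as follows. The clause sets are the standard CNF representations of $v \leftrightarrow u \,\&\, w$ and $v \leftrightarrow \neg u$; the integer literal encoding, the helper names and the tiny circuit are assumptions made for illustration only.

```python
from itertools import product

def encode_and(v, u, w):
    """Clauses of the CNF for v <-> (u AND w)."""
    return [[-v, u], [-v, w], [v, -u, -w]]

def encode_not(v, u):
    """Clauses of the CNF for v <-> (NOT u)."""
    return [[-v, -u], [v, u]]

def count_models(cnf, n):
    """Count assignments of variables 1..n that satisfy every clause."""
    count = 0
    for bits in product((0, 1), repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf):
            count += 1
    return count

# Hypothetical circuit: y = NOT(x1 AND x2); x1, x2 are variables 1, 2,
# the AND gate gets auxiliary variable 3 and the NOT gate (the output) 4.
gates = encode_and(3, 1, 2) + encode_not(4, 3)
models_y1 = count_models(gates + [[4]], 4)   # solutions with output fixed to 1
models_y0 = count_models(gates + [[-4]], 4)  # solutions with output fixed to 0
```

Fixing the output variable by a unit clause turns the circuit encoding into an inversion instance; the model counts match the numbers of preimages (three inputs give output 1, one input gives output 0), which is the parsimony property in miniature.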

In our opinion for the problems considered 
in this paper it is more convenient to construct 
Boolean equations that encode considered al- 
gorithms directly and not to use Boolean cir- 
cuits for intermediate representation of these 
algorithms (see Section 2.2). 

We consider the problem of finding solutions of Boolean equations in the
following general formulation:

$$F\left(h_1(x^1_1, \ldots, x^1_{r_1}), \ldots, h_s(x^s_1, \ldots, x^s_{r_s})\right) = 1. \tag{2}$$

The propositional formulae $h_i(x^i_1, \ldots, x^i_{r_i})$,
$i \in \{1, \ldots, s\}$, define some (in the general case composite)
Boolean functions. Let

$$X = \{x_1, \ldots, x_n\} = \bigcup_{i=1}^{s} \{x^i_1, \ldots, x^i_{r_i}\}.$$

Consider the Boolean equation

$$G(u_1 \leftrightarrow h_1) \cdot F_{h_1 \to u_1}(x_1, \ldots, x_n, u_1) = 1. \tag{3}$$

Here by $G(u_1 \leftrightarrow h_1)$ we denote a CNF representation of the
Boolean function $u_1 \leftrightarrow h_1(x^1_1, \ldots, x^1_{r_1})$ over
$\{x^1_1, \ldots, x^1_{r_1}, u_1\}$, and by
$F_{h_1 \to u_1}(x_1, \ldots, x_n, u_1)$ we denote the propositional
formula obtained by replacing one or several (perhaps all) occurrences of
the formula $h_1(x^1_1, \ldots, x^1_{r_1})$ in (2) by the literal $u_1$.

The transition from equation (2) to equation (3) is one iteration of the
Tseitin transformations as applied to Boolean equations.

Let $\Phi_1$ and $\Phi_2$ be the sets of solutions of equations (2) and
(3) respectively. It is not difficult to show that the reduction from (2)
to (3) described above is parsimonious, because $\Phi_1$ and $\Phi_2$ are
either simultaneously empty or there exists a one-to-one correspondence
between them. Implicitly this fact is mentioned in [8].
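The one-to-one correspondence between the solution sets can be checked directly on a small instance. The sketch below assumes a hypothetical equation of form (2) with $h_1 = x_1 \cdot x_2$ and $F(h_1, x_3) = h_1 \oplus x_3$, performs one Tseitin step by introducing $u_1$, and compares the two solution sets by enumeration.

```python
from itertools import product

def h1(x1, x2):
    """Hypothetical inner formula h_1 = x1 AND x2."""
    return x1 & x2

def F(h, x3):
    """Hypothetical outer formula F = h XOR x3."""
    return h ^ x3

def G(u, x1, x2):
    """CNF representation of u <-> (x1 AND x2), evaluated on 0/1 values."""
    return ((1 - u) | x1) & ((1 - u) | x2) & (u | (1 - x1) | (1 - x2))

# Solutions of the original equation F(h1(x1, x2), x3) = 1.
sols_before = [(x1, x2, x3)
               for x1, x2, x3 in product((0, 1), repeat=3)
               if F(h1(x1, x2), x3) == 1]

# Solutions of the transformed equation G(u <-> h1) * F(u, x3) = 1.
sols_after = [(x1, x2, x3, u)
              for x1, x2, x3, u in product((0, 1), repeat=4)
              if G(u, x1, x2) == 1 and F(u, x3) == 1]

# Dropping the auxiliary variable u maps solutions of the second equation
# back to solutions of the first one.
projected = sorted(s[:3] for s in sols_after)
```

Both equations have the same number of solutions, and discarding the auxiliary variable recovers each original solution exactly once.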
2.2 Logical cryptanalysis. Propositional encoding of the keystream generator A5/1

The concept of reducing the problems of crypt- 
analysis to the problems of finding solutions 
of Boolean equations (in the form of SAT 
problems) was first formulated in [9]. One of 
the first practical implementations of this idea 
was given in [10]. In that paper the problem 
of cryptanalysis of the DES cipher was formu- 
lated as a SAT problem. 

Next, we consider the keystream generator A5/1 used to encrypt traffic in
GSM networks. Many attacks on the cipher A5/1 have been described;
however, it is still actively used. A possible reason for the continued
use of A5/1 is the lack of convincing experimental results of its
cryptanalysis. We consider the problem of cryptanalysis of the generator
A5/1 on the basis of a known keystream. The problem is to find the secret
key using some fragment of the keystream and a known algorithm of its
generation (see [11]). The description of the generator A5/1 (see Fig. 1)
was taken from the paper [12]. According to [12], the generator A5/1
contains three linear feedback shift registers (LFSR, see, e.g., [11]),
given by the following connection polynomials:
LFSR 1: $X^{19} + X^{18} + X^{17} + X^{14} + 1$;
LFSR 2: $X^{22} + X^{21} + 1$;
LFSR 3: $X^{23} + X^{22} + X^{21} + X^{8} + 1$.

The secret key of the A5/1 generator is the initial contents of LFSRs 1-3
(64 bits). In each unit of time $\tau \in \{1, 2, \ldots\}$ ($\tau = 0$ is
reserved for the initial state) two or three registers are shifted. The
register with number $r$, $r \in \{1, 2, 3\}$, is shifted if
$\chi_r(b_1^\tau, b_2^\tau, b_3^\tau) = 1$, and is not shifted if
$\chi_r(b_1^\tau, b_2^\tau, b_3^\tau) = 0$. By
$b_1^\tau, b_2^\tau, b_3^\tau$ we denote here the values of the clocking
bits at the current unit of time. The clocking bits are the 9-th, 30-th
and 52-nd. The corresponding cells in Fig. 1 are black. The function
$\chi_r(\cdot)$ is defined as follows:

$$\chi_r(b_1^\tau, b_2^\tau, b_3^\tau) =
\begin{cases}
1, & b_r^\tau = \mathrm{majority}(b_1^\tau, b_2^\tau, b_3^\tau), \\
0, & b_r^\tau \ne \mathrm{majority}(b_1^\tau, b_2^\tau, b_3^\tau),
\end{cases}$$

where $\mathrm{majority}(A, B, C) = A \cdot B \vee A \cdot C \vee B \cdot C$.

In each unit of time the values in the leftmost cells of the registers are
added mod 2; the resulting bit is the bit of the keystream.

Thus, we can see that the generator A5/1 updates the content of each of
the registers' cells as a result of conditional shifts: if the shift does
not occur, then the new configuration of a register does not differ from
the old one; otherwise the values of all cells of the register are
updated. Hence with each cell at each unit of time we can associate a
Boolean equation linking the new state of the cell with the previous one.

Fig. 1. Scheme of the generator A5/1 (cells 1-19 form LFSR 1, cells 20-41
form LFSR 2, cells 42-64 form LFSR 3).

Let the variables $x_1, \ldots, x_{64}$ encode the secret key of the
generator A5/1 ($x_i$ corresponds to the cell with number
$i \in \{1, \ldots, 64\}$). By $x_1^1, \ldots, x_{64}^1$ we denote the
variables encoding the state of the cells at the moment of time
$\tau = 1$. The system of equations which links these two sets of
variables is:



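$$
\begin{cases}
\left(x_1^1 \leftrightarrow x_1 \cdot \bar{\chi}_1 \vee \left(\bigoplus_{i \in I} x_i\right) \cdot \chi_1\right) = 1 \\
\left(x_2^1 \leftrightarrow x_2 \cdot \bar{\chi}_1 \vee x_1 \cdot \chi_1\right) = 1 \\
\ldots \\
\left(x_{20}^1 \leftrightarrow x_{20} \cdot \bar{\chi}_2 \vee \left(\bigoplus_{j \in J} x_j\right) \cdot \chi_2\right) = 1 \\
\left(x_{21}^1 \leftrightarrow x_{21} \cdot \bar{\chi}_2 \vee x_{20} \cdot \chi_2\right) = 1 \\
\ldots \\
\left(x_{42}^1 \leftrightarrow x_{42} \cdot \bar{\chi}_3 \vee \left(\bigoplus_{k \in K} x_k\right) \cdot \chi_3\right) = 1 \\
\left(x_{43}^1 \leftrightarrow x_{43} \cdot \bar{\chi}_3 \vee x_{42} \cdot \chi_3\right) = 1 \\
\ldots \\
\left(x_{64}^1 \leftrightarrow x_{64} \cdot \bar{\chi}_3 \vee x_{63} \cdot \chi_3\right) = 1 \\
\left(g^1 \leftrightarrow x_{19}^1 \oplus x_{41}^1 \oplus x_{64}^1\right) = 1
\end{cases}
\tag{4}
$$

where $I = \{14, 17, 18, 19\}$, $J = \{40, 41\}$,
$K = \{49, 62, 63, 64\}$, $\chi_r$ stands for the value of the clocking
function of register $r$ computed from the clocking bits of the previous
state, and $g^1$ is the first bit of the keystream.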

Let $g^1, \ldots, g^L$ be the first $L$ bits of the keystream of A5/1.
With each bit $g^i$, $i \in \{1, \ldots, L\}$, we associate a system of
the form (4). To find the secret key it is sufficient to find a common
solution of these systems. The problem of finding this common solution can
be reduced by means of Tseitin transformations to the problem of finding a
satisfying assignment of a satisfiable CNF.
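The generator as described in this section can be modelled by a short sketch. It follows the simplified model of the text, in which the 64-bit secret key is the initial content of the three registers, and it uses the cell numbering, clocking cells and feedback cell sets given above; the function names and the data layout are assumptions made for illustration.

```python
# Cell numbering as in the text: LFSR 1 = cells 1..19, LFSR 2 = cells 20..41,
# LFSR 3 = cells 42..64. Each entry: (first cell, last cell, clocking cell,
# feedback cells).
REGISTERS = [
    (1, 19, 9, (14, 17, 18, 19)),
    (20, 41, 30, (40, 41)),
    (42, 64, 52, (49, 62, 63, 64)),
]

def majority(a, b, c):
    """majority(A, B, C) = A*B v A*C v B*C."""
    return (a & b) | (a & c) | (b & c)

def a51_step(state):
    """One unit of time. `state` is a list of 65 bits, index 0 unused.
    Returns the new state and the produced keystream bit."""
    m = majority(*(state[clk] for _, _, clk, _ in REGISTERS))
    new = list(state)
    for first, last, clk, taps in REGISTERS:
        if state[clk] == m:              # register is shifted
            feedback = 0
            for t in taps:
                feedback ^= state[t]
            for i in range(last, first, -1):
                new[i] = state[i - 1]
            new[first] = feedback
    return new, new[19] ^ new[41] ^ new[64]

def a51_keystream(key_bits, length):
    """Keystream of the given length for a 64-bit initial register content."""
    assert len(key_bits) == 64
    state = [0] + list(key_bits)
    out = []
    for _ in range(length):
        state, g = a51_step(state)
        out.append(g)
    return out
```

Cryptanalysis in the sense of this section is the inverse task: given the output of `a51_keystream`, recover `key_bits`.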

3 Coarse-grained parallelization of SAT problems encoding discrete functions inversion problems

In this section we describe a technology for 
solving SAT problems in distributed comput- 
ing systems (hereinafter DCS). Such systems 
consist of sets of computing nodes connected 
by a communication network. Each node of a 
DCS has one or several processors. Typical ex- 
amples of DCS are computing clusters which 
have become widespread in recent years. The 
elementary computational units of modern 
DCS are cores of processors. 

One of the first works on parallel algorithms for solving SAT problems is
the article [13]. It describes a technique for parallelizing the
Davis-Putnam procedure (see [14]) with interprocessor data exchange aimed
at achieving uniform loading of the processors. Similar ideas underlie
those modern SAT solvers which use interprocessor exchange of conflict
clauses (see, e.g., [15]).

In the present work a different approach to the parallelization of
algorithms for solving SAT problems is proposed. This approach is
primarily oriented to solving the problems from $\mathrm{Inv}(f)$. We show
that preliminary calculations can, in principle, be used to determine good
parameters of a decomposition of the search domain into disjoint
subdomains. After the decomposition the obtained subdomains are processed
by isolated processors. In this sense the main results of this article
belong entirely to the field of coarse-grained parallelism (see, e.g.,
[16]).

We consider an arbitrary CNF $C$ over the set of Boolean variables
$X = \{x_1, \ldots, x_n\}$ and select in the set $X$ some subset

$$X' = \{x_{i_1}, \ldots, x_{i_d}\}, \qquad \{i_1, \ldots, i_d\} \subseteq \{1, \ldots, n\},$$

where $d \in \{1, \ldots, n\}$. We call $X'$ a decomposition set and $d$
the power of the decomposition set. With the decomposition set $X'$,
$|X'| = d$, we associate the set $Y(X') = \{Y_1, \ldots, Y_K\}$ consisting
of $K = 2^d$ different binary vectors of length $d$, each of which is a
vector of values of the variables $x_{i_1}, \ldots, x_{i_d}$. By
$C_j = C|_{Y_j}$, $j = 1, \ldots, K$, we denote the CNF obtained after
substituting the values from the vector $Y_j$ into $C$. The decomposition
family generated from the CNF $C$ by the set $X'$ is the set
$\Delta_C(X')$ formed by the following CNFs:

$$\Delta_C(X') = \{C_1 = C|_{Y_1}, \ldots, C_K = C|_{Y_K}\}.$$

It is not difficult to see that any truth assignment
$\alpha \in \{0,1\}^n$ satisfying $C$ ($C|_\alpha = 1$) coincides with
some vector $Y^\alpha \in Y(X')$ in the components from $X'$ and coincides
with some satisfying assignment of the CNF
$C|_{Y^\alpha} \in \Delta_C(X')$ in the remaining components. Moreover,
the CNF $C$ is unsatisfiable if and only if all the CNFs in $\Delta_C(X')$
are unsatisfiable. Therefore, the SAT problem for the original CNF $C$ is
reduced to $K$ SAT problems for the CNFs from the set $\Delta_C(X')$. A
DCS can be used for processing the set $\Delta_C(X')$ as a parallel task
list.
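The construction of a decomposition family can be sketched as follows. The integer literal encoding, the helper names and the small CNF are assumptions for illustration; the brute-force satisfiability check merely stands in for a real SAT solver.

```python
from itertools import product

def substitute(cnf, partial):
    """Substitute values {variable: 0/1} into a CNF. Satisfied clauses are
    dropped, falsified literals removed; returns None if a clause becomes empty."""
    result = []
    for clause in cnf:
        reduced, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var in partial:
                if partial[var] == (1 if lit > 0 else 0):
                    satisfied = True
                    break
            else:
                reduced.append(lit)
        if satisfied:
            continue
        if not reduced:
            return None
        result.append(reduced)
    return result

def decomposition_family(cnf, decomposition_set):
    """All 2^d pairs (vector of values, CNF after substitution)."""
    family = []
    for values in product((0, 1), repeat=len(decomposition_set)):
        family.append((values, substitute(cnf, dict(zip(decomposition_set, values)))))
    return family

def is_satisfiable(cnf):
    """Brute-force check; None denotes a CNF with an empty (false) clause."""
    if cnf is None:
        return False
    variables = sorted({abs(l) for c in cnf for l in c})
    for bits in product((0, 1), repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

# Hypothetical CNF over x1, x2, x3 decomposed by the set {x1, x2}.
cnf = [[1, 2], [-1, 3], [-2, -3]]
family = decomposition_family(cnf, [1, 2])
```

The original CNF is satisfiable exactly when at least one member of the family is, so the members can be handed to different cores independently.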

Note that the idea of such parallelization is not new in itself. An
approach to SAT problems that is similar in spirit is presented in [17].
The main novelty of our approach is the technique, described below, for
finding decomposition sets with "good" properties. This technique is based
on a simple procedure of statistical prediction.

In the case of arbitrary SAT problems it is not clear how to form
decomposition sets. However, below we show that for the SAT problem
encoding the inversion problem of a discrete function $f_n$, good
candidates for the role of decomposition sets are subsets of the set of
input variables of $f_n$.

If we take as a decomposition set the whole set of input variables of
$f_n$, then every SAT problem in the obtained family is simple, but the
number of these problems is usually very large. Thus there arises the
following problem of improving a decomposition set. Let $X'$ be some
decomposition set (e.g., the set of input variables of $f_n$). We need to
construct a set $\tilde{X} \subseteq X'$ for which there are grounds to
expect a smaller total processing time of the list $\Delta_C(\tilde{X})$.

If the power of the set $\Delta_C(\tilde{X})$ is too high, then the
prediction of the time of parallel processing of $\Delta_C(\tilde{X})$ can
be computed based on the average solving time of SAT problems for some
CNFs that are randomly chosen (with uniform distribution) from
$\Delta_C(\tilde{X})$. By $Q(\tilde{X})$ we denote the size of this
sample.

We introduce a parameter $R$ to distinguish two situations: whether it is
necessary to form a random sample or not.

With each set $\tilde{X} \subseteq X'$ such that $2^{|\tilde{X}|} > R$ we
associate a set of $Q(\tilde{X})$ vectors selected from $Y(\tilde{X})$
with uniform distribution and the set $\theta_C(\tilde{X})$ of the CNFs
obtained by substituting these vectors into $C$.

With each $\tilde{X} \subseteq X'$ such that $2^{|\tilde{X}|} \le R$ we
associate the set $Y(\tilde{X})$ and the set of CNFs

$$\theta_C(\tilde{X}) = \Delta_C(\tilde{X}).$$

Thus, to an arbitrary set $\Omega$, $\Omega \subseteq 2^{X'}$, of the
choice alternatives of $\tilde{X}$ from $X'$, the set

$$\Theta_C(\Omega) = \{\theta_C(\tilde{X})\}_{\tilde{X} \in \Omega}$$

is put in correspondence.

Hereinafter we consider SAT solvers based 
on DPLL (see [18]). Using properties of this 
algorithm we construct the procedure for pre- 
dicting time of parallel solving of SAT prob- 
lems which encode discrete functions inver- 
sion problems. 

Let $S$ be some SAT solver. Denote by $t(C)$ the running time of the SAT
solver $S$ on an arbitrary CNF $C$. Consider the function

$$\tau_S : \theta_C(\tilde{X}) \mapsto \sum_{C' \in \theta_C(\tilde{X})} t(C').$$

However, for some $\tilde{X}$ (e.g., if $\tilde{X}$ consists of one
variable) the CNFs from $\theta_C(\tilde{X})$ can be very difficult for
the SAT solver. In this case the time required for computing the
corresponding value of the predictive function may exceed reasonable
limits. To take this fact into account we introduce a special function
$g(C)$.

Suppose that the set $\Theta_C(\Omega)$ is constructed in accordance with
the rules above. We define the predictive function as follows:

$$T(\theta_C(\tilde{X})) =
\begin{cases}
\dfrac{2^{|\tilde{X}|}}{Q(\tilde{X})} \cdot \tau_S(\theta_C(\tilde{X})), & \text{if } 2^{|\tilde{X}|} > R,\ \tau_S(\theta_C(\tilde{X})) \le g(C), \\
\tau_S(\theta_C(\tilde{X})), & \text{if } 2^{|\tilde{X}|} \le R,\ \tau_S(\theta_C(\tilde{X})) \le g(C), \\
\infty, & \text{if } \tau_S(\theta_C(\tilde{X})) > g(C).
\end{cases}$$

The notation $T(\theta_C(\tilde{X})) = \infty$ means that the function is
not defined at $\theta_C(\tilde{X})$. The value $T(\theta_C(\tilde{X}))$
is the prediction of the time required for a sequential processing of the
list $\Delta_C(\tilde{X})$. Thus the problem of constructing a "good"
decomposition set is reduced to the problem of minimizing the function
$T(\cdot)$ on the set $\Theta_C(\Omega)$. Knowing the global minimum of
$T(\cdot)$ on $\Theta_C(\Omega)$ we can draw a conclusion about the
possibility of parallel solving of the SAT problem for the CNF $C$ in "a
reasonable time".
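The predictive function can be sketched in a few lines. The sketch is not the authors' implementation: the callback `solve_cost` is a hypothetical stand-in for the measured solver time on one CNF of the family, the sample is drawn with replacement (an assumption), and the names `R`, `Q` and `g_bound` mirror the parameters $R$, $Q(\tilde{X})$ and $g(C)$ of the text.

```python
import random
from itertools import product

def predictive_value(d, solve_cost, R, Q, g_bound, rng):
    """Predicted total processing time for a decomposition set of power d.
    solve_cost(values) is the solver cost on the CNF obtained by substituting
    the vector `values`; returns infinity if the bound g is exceeded."""
    K = 2 ** d
    if K > R:
        # Too many CNFs: estimate from a random sample of Q vectors.
        sample = [tuple(rng.randint(0, 1) for _ in range(d)) for _ in range(Q)]
        scale_num, scale_den = K, Q
    else:
        # Few CNFs: process the whole decomposition family.
        sample = list(product((0, 1), repeat=d))
        scale_num, scale_den = 1, 1
    total = 0.0
    for values in sample:
        total += solve_cost(values)
        if total > g_bound:
            return float("inf")
    return scale_num * total / scale_den

# Hypothetical cost model: every subproblem costs 2 time units.
constant_cost = lambda values: 2.0
rng = random.Random(0)
small = predictive_value(3, constant_cost, R=100, Q=50, g_bound=1e6, rng=rng)
large = predictive_value(10, constant_cost, R=100, Q=50, g_bound=1e6, rng=rng)
too_hard = predictive_value(10, constant_cost, R=100, Q=50, g_bound=10.0, rng=rng)
```

With a constant cost the prediction is exact: 8 subproblems give 16 units, and 1024 subproblems estimated from 50 samples give 2048 units.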

Theorem 1. Consider the problem $\mathrm{Inv}(f)$. Let $C$ be a CNF
encoding the problem of inversion of a function $f_n$, $f_n \in f$, at an
arbitrary point $y \in \mathrm{range}\, f_n$. Then there exist sets
$\Omega$, $\Theta_C(\Omega)$, $|\Theta_C(\Omega)| \le 2^n$, and a function
$T(\cdot)$ such that the domain of $T(\cdot)$ is nonempty and the global
minimum of $T(\cdot)$ on $\Theta_C(\Omega)$ can be found in time

$$O\left(|C| \cdot |\Theta_C(\Omega)|\right).$$

Proof. Suppose that $\Omega$ contains the set $X$ of input variables of
the function $f_n$. Here we assume that the CNF $C$ encoding the
corresponding inversion problem is constructed in accordance with the
principles described in Section 2.1. It is known (see, e.g., [19]) that
DPLL remains complete for $C$ even if the set of decision variables (see
[20]) is limited to $X$. Thus assigning values to all the variables from
$X$ results in the inference of either a satisfying assignment of $C$ or a
contradiction (conflict) through unit propagation (UP, see [21]). In
general, the complexity of this process has an upper bound of the form
$O(|C|)$. Therefore it is always possible to construct some function
$g(\cdot)$, $g(C) = O(|C|)$, such that for an arbitrary sample
$\theta_C(X)$ the value of the function $T(\theta_C(X))$ is defined.

Next, we describe an algorithm based on the principle of "dynamic
programming" for solving the problem of minimizing the function $T(\cdot)$
on the set $\Theta_C(\Omega)$.

Let the initial decomposition set $X'$ be the set of the input variables
of $f_n$ and $2^{|X'|} > R$. We construct the set $\theta_C(X')$ and find
the value $T(\theta_C(X'))$. As already mentioned, this value can be found
efficiently. Consequently, the function $T(\cdot)$ is defined at least at
$\theta_C(X')$. Next, we try to sequentially improve the value of the
function $T(\cdot)$ on some fixed set $\Omega \subseteq 2^{X'}$.

By $i$ we denote the iteration number of the algorithm. The results of the
initial ($i = 0$) iteration step are the values $\tau_S(\theta_C(X'))$ and
$T(\theta_C(X'))$. At each iteration step $i \ge 1$ we compute the values
of the functions $\tau_S(\theta_C(\tilde{X}))$ and
$T(\theta_C(\tilde{X}))$ for the corresponding set $\tilde{X}$. We denote
them by $\tau^i$ and $T^i$ respectively. For each $i \ge 1$, in the
process of calculations we perform frequent checks to determine whether
the current value of $T^i$ has exceeded the value $T^{i-1}$ which was
found at the previous iteration step. If this takes place, then further
calculations are useless, since the calculated value $T^i$ will not be
better than $T^{i-1}$. Then we interrupt the calculations and go to the
next iteration step. For example, we denote by $\tau^0$ the value
$\tau_S(\theta_C(X'))$ and calculate $T(\theta_C(\tilde{X}))$ for some
$\tilde{X} \subset X'$. If the inequality

$$\tau_S(\theta_C(\tilde{X})) > 2^{|X'| - |\tilde{X}|} \cdot \tau^0$$

holds and $2^{|\tilde{X}|} > R$, $Q(\tilde{X}) = Q(X')$, then
$T(\theta_C(\tilde{X})) > T(\theta_C(X'))$. Obviously in this case the
global minimum of $T(\cdot)$ on $\Theta_C(\Omega)$ cannot be achieved at
$\theta_C(\tilde{X})$. If in the process of calculating
$\tau_S(\theta_C(\tilde{X}))$ the bound $g(C)$ is exceeded, then the value
of $T(\cdot)$ at $\theta_C(\tilde{X})$ is not defined.

The result of the described procedure of iterative improvement of the
value of the function $T(\cdot)$ on $\Theta_C(\Omega)$ is a set $X_*$,
$X_* \in 2^{X'}$, such that the value of $T(\theta_C(X_*))$ is minimal
(among all $\tilde{X} \in \Omega$). The number of necessary iterations is
$|\Theta_C(\Omega)|$ and the upper bound for the time of each iteration is
$O(|C|)$. □
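The iterative improvement with interruption can be sketched as a loop over candidate sets. This is an illustration under stated assumptions rather than the authors' code: `costs_for` and `scale_for` are hypothetical callbacks returning the per-CNF solver costs of a sample and the scaling factor of the predictive function, and the partial value is compared against the best value found so far.

```python
def minimize_predictive(candidates, costs_for, scale_for, g_bound):
    """Return (best candidate, its predicted time) over the given candidates.
    The evaluation of a candidate is interrupted as soon as its partial
    predicted time can no longer beat the best value found so far, or as
    soon as the accumulated solver cost exceeds the bound g."""
    best_set, best_value = None, float("inf")
    for X in candidates:
        scale, total, interrupted = scale_for(X), 0.0, False
        for cost in costs_for(X):
            total += cost
            if total > g_bound or scale * total >= best_value:
                interrupted = True
                break
        if not interrupted:
            best_set, best_value = X, scale * total
    return best_set, best_value

# Hypothetical measurements for three candidate decomposition sets.
sample_costs = {"X1": [1.0, 1.0, 1.0], "X2": [5.0, 5.0, 5.0], "X3": [1.0, 2.0]}
scales = {"X1": 4.0, "X2": 2.0, "X3": 3.0}
best = minimize_predictive(["X1", "X2", "X3"], sample_costs.get, scales.get,
                           g_bound=100.0)
```

In this example the evaluation of `X2` is interrupted after its second subproblem, because its partial prediction already exceeds the value obtained for `X1`.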

Note that even if $X'$ is the set of input variables of the considered
function, processing the whole set $2^{X'}$ is infeasible for practically
important problems. Therefore, the peculiarities of the original
formulation should be taken into account and various heuristics should be
used to form $\Omega$ in each particular case. Of course, there is no
guarantee that a decomposition set better (in terms of total computing
time) than $X_*$ does not exist. However, the examples described in
Section 4 show the practical effectiveness of the proposed method of
predictive functions even with quite simple techniques for constructing
the sets $\Omega$.

Further, we assume that we have some decomposition set $X_*$ with a good
value $T(\theta_C(X_*))$ for some CNF $C$. The decomposition family
generated by the set $X_*$ we denote by
$\Delta_*(C) = \{C_1, \ldots, C_K\}$, $K = 2^{|X_*|}$. Suppose that the
considered DCS has $M$ computing cores. The following two cases are
possible.

1) $K \le M$, i.e. the number of CNFs in the family $\Delta_*(C)$ does not
exceed the number of cores of the DCS. In this case for each CNF from the
family $\Delta_*(C)$ the SAT problem is solved on a separate core. In
practice this situation is very rare.

2) $K > M$, i.e. the number of CNFs in the family $\Delta_*(C)$ is greater
than the number of cores in the DCS.

The situation described in case 2 is the most typical for inversion
problems of cryptographic functions. In this case the decomposition family
$\Delta_*(C)$ is considered as a task list that is processed in parallel
according to the following scheme. Let us put the CNFs of the family
$\Delta_*(C)$ in some order. We call an arbitrary CNF from $\Delta_*(C)$
locked if at the current moment of time the SAT problem for it has either
been solved or is being solved on some core of the DCS. The other CNFs are
called free. We select the first $M$ CNFs $C_1, \ldots, C_M$ from the
family $\Delta_*(C)$. For each of the selected CNFs we solve the SAT
problem on a separate core of the DCS. Once some core is released, we
launch the procedure of solving the SAT problem for the first free CNF of
the family $\Delta_*(C)$ on this core. This process continues until a
satisfying assignment for some CNF from $\Delta_*(C)$ is found, or until
the unsatisfiability of all CNFs from $\Delta_*(C)$ is proven.
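The processing scheme for the task list can be simulated with a priority queue of running tasks. The sketch below is a simulation under assumptions, not an actual distributed implementation: each task is a hypothetical pair of a solving time and a flag telling whether the corresponding CNF is satisfiable.

```python
import heapq

def process_task_list(tasks, M):
    """Simulate processing an ordered task list on M cores.
    tasks: list of (duration, satisfiable). Returns (time, index) of the
    first satisfiable CNF to finish, or (time, None) when all are refuted."""
    running = []        # heap of (finish time, task index) for locked tasks
    next_free = 0       # index of the first free task
    now = 0.0
    while next_free < len(tasks) and len(running) < M:
        heapq.heappush(running, (tasks[next_free][0], next_free))
        next_free += 1
    while running:
        now, idx = heapq.heappop(running)   # some core is released
        if tasks[idx][1]:
            return now, idx                 # satisfying assignment found
        if next_free < len(tasks):
            heapq.heappush(running, (now + tasks[next_free][0], next_free))
            next_free += 1
    return now, None

# Hypothetical task list: the fourth CNF is the satisfiable one.
tasks = [(3, False), (1, False), (2, False), (2, True), (5, False)]
found = process_task_list(tasks, M=2)
refuted = process_task_list([(3, False), (1, False), (2, False)], M=2)
```

The simulation stops as soon as any core reports a satisfying assignment; otherwise it runs until the whole list has been refuted.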

In the next section we describe a somewhat different strategy of
processing the list $\Delta_*(C)$, which significantly reduces the cost of
transferring tasks over a network.

4 Cryptanalysis of some keystream generators on a computing cluster

4.1 Adjustments of a SAT solver to solving the problems of cryptanalysis in distributed computing environments

In all experiments described below we use a modified SAT solver Minisat-C
v1.14.1 (see [22]). The first stage of the modification consists in
changing the decision variable selection procedure (see [20]) implemented
in Minisat. Namely, we added a procedure that assigns a nonzero initial
activity to those variables of the CNF which correspond to the input
variables of the function considered. For the problems of cryptanalysis of
generators this makes it possible, at the initial stage of the solving
process, to select the variables corresponding to the secret key as
priority variables for the decision variable selection procedure. Some
basic constants of the solver were also changed. Like most of its analogs,
Minisat periodically changes the activity of all the variables and clauses
in order to increase the selection priority of variables from the clauses
derived in the later steps of the search. Moreover, in 2% of cases Minisat
assigns a value to a randomly selected variable rather than to the
variable with the maximum activity. These heuristics show, on average,
good results on the broad set of test examples used in SAT solver
competitions. However, for CNFs encoding problems of cryptanalysis they
are, in general, not efficient. In all the experiments described below we
use a SAT solver in which the periodic lowering of the activity and the
random selection of variables are disabled. Taken together, these simple
changes led to a substantial increase in the efficiency of the SAT solver
on cryptographic tests. For example, on the CNFs from the decomposition
family constructed in the process of cryptanalysis of the generator A5/1
(see Section 5), the SAT solvers Minisat 1.14.1 and Minisat 2.0 did not
cope with the tasks in 10 minutes of work (the computations were
interrupted). The modified Minisat-C v1.14.1 solved these problems in less
than 0.2 seconds on average (see Table 3).

In the preceding section a general procedure for parallel processing of a
list of tasks was described. During this procedure the control process
monitors the loading of the computing cores and sends new tasks to the
released cores. In practice, a direct implementation of this scheme
provides uniform loading of the cores but leads to an excessive growth of
transfer costs.

The efficiency of a SAT solver in a DCS can be improved by using job
batches. Each job batch is a subset of the decomposition family
$\Delta_*(C)$. Sending batches instead of single CNFs reduces the cost of
the transfer. We decompose $\Delta_*(C)$ into disjoint job batches. The
obtained set of job batches is considered as a task list where each job
batch is a list item. For processing this task list we use the technique
described in the previous section.

The fact that a decomposition set is a set of Boolean variables makes the
problem of transferring the batches to the cores very simple. Indeed, let
$X_* = \{x_{i_1}, \ldots, x_{i_d}\}$ be some decomposition set, and let
$M$ be the number of computing cores in the DCS. The core with the number
$p \in \{1, \ldots, M\}$ we denote by $e_p$. For the sake of simplicity,
assume that $M = 2^k$, $k \in \mathbb{N}_1$, and $k < d$. If we suppose
that all the tasks in the decomposition family $\Delta_*(C)$, generated by
$X_*$, have approximately equal complexity, then when solving the problem
in the DCS each core is going to process approximately the same number of
tasks. This means that the decomposition family $\Delta_*(C)$ can be
partitioned into $2^k$ subfamilies of equal power and each subfamily can
be processed entirely on the corresponding core. For this purpose we
select in $X_*$ some subset $X_*^k$ of power $k$ ($X_*^k$ can be formed,
for example, by the first $k$ variables from $X_*$). The description of
the job batch for a particular $e_p$, $p \in \{1, \ldots, 2^k\}$, is a
binary vector $\alpha_p$ of length $k$, formed by the values of the
variables from $X_*^k$. Next, for each $e_p$, $p = 1, \ldots, 2^k$, we
consider the set $A_p$ consisting of $2^{d-k}$ different vectors of length
$d$ of the form $(\alpha_p | \beta)$, where $\beta$ takes all $2^{d-k}$
possible values from the set $\{0,1\}^{d-k}$.

Each core $e_p$, $p \in \{1, \ldots, 2^k\}$, receives its job batch from
the control process as a vector $\alpha_p$ which is used for constructing
the set $A_p$. The subfamily of the family $\Delta_*(C)$ processed on
$e_p$ is obtained by substituting the vectors from $A_p$ into $C$.
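The batch construction can be sketched as follows: the control process sends only a prefix vector of length $k$, and the core expands it locally into all vectors of length $d$ with that prefix. The function name and the small parameters `d = 4`, `k = 2` are assumptions chosen for illustration.

```python
from itertools import product

def batch_vectors(alpha, d):
    """Expand a batch description alpha (tuple of k bits) into all 2^(d-k)
    vectors (alpha | beta) of length d."""
    k = len(alpha)
    return [alpha + beta for beta in product((0, 1), repeat=d - k)]

# Hypothetical setting: decomposition set of power d = 4, M = 2^k = 4 cores.
d, k = 4, 2
batches = {alpha: batch_vectors(alpha, d) for alpha in product((0, 1), repeat=k)}
```

Only $k$ bits travel over the network per batch, while the batches together cover every vector of values of the decomposition set exactly once.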



4.2 Cryptanalysis of some keystream generators on a low-performance computing cluster

Cryptanalysis problems for some cryptographically weak generators (such as the Geffe generator and the Wolfram generator) can be solved successfully on an ordinary personal computer using the SAT approach (see [23]).

The results described above made it possible to perform a successful parallel logical cryptanalysis of the following generators: the threshold generator (with a key length of 80 bits), the summation generator (63 bits) and the Gifford generator (64 bits). These generators are not considered cryptographically strong, since several attacks on them are known. However, these attacks differ substantially from each other. By contrast, the main stages of parallel logical cryptanalysis are the same for all of the mentioned generators.

For our experiments we used the low-performance computing cluster Blackford (see [24]). Each computing node of the Blackford cluster has two quad-core Intel Xeon E5345 processors (2.33 GHz) and 8 GB of RAM. The cluster has 20 nodes of this configuration.

During the cryptanalysis of the threshold and the summation generators we found that, to solve these problems successfully, the decomposition set has to include the variables encoding the full initial contents of several LFSRs. The CNF obtained after substituting the values of these variables proved to be very simple for the SAT solver. This fact allows us to calculate the value of the predictive function for the corresponding decomposition quickly. Then we remove one variable from the decomposition set and calculate the value of the predictive function for the obtained set again by the algorithm described above (Section 3). This procedure is repeated until all variants $\tilde{X}$ from some $\Theta$ are tested. Thus, for each generator considered below, all variants of $\tilde{X}$ are related by the rule

$\tilde{X}_1 \supset \tilde{X}_2 \supset \cdots \supset \tilde{X}_r = \tilde{X}^* \supset \cdots \supset \tilde{X}_s$,

$|\tilde{X}_{i+1}| = |\tilde{X}_i| - 1$, $i = 1, \ldots, s - 1$, for some $s$.
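The nested chain of decomposition sets can be illustrated with a short Python sketch. It is not the algorithm of Section 3: the predictive function is passed in as an ordinary callable, the order in which variables are removed is supplied by the caller, and we assume that a smaller predictive-function value is better. All names are hypothetical.

```python
def best_in_chain(initial_set, removal_order, predict):
    """Walk a chain of nested sets and return the set with the smallest value.

    initial_set   -- iterable of variable indices forming the first set
    removal_order -- variables to drop, one per step (hypothetical choice)
    predict       -- callable returning the predictive-function value of a set
    """
    current = frozenset(initial_set)
    best_set, best_value = current, predict(current)
    for var in removal_order:
        current = current - {var}  # each new set has one variable fewer
        value = predict(current)
        if value < best_value:
            best_set, best_value = current, value
    return best_set, best_value


if __name__ == "__main__":
    # Toy predictive function with a minimum at sets of size 3.
    toy_predict = lambda s: (len(s) - 3) ** 2
    print(best_in_chain(range(1, 7), [6, 5, 4, 3, 2], toy_predict))
```

In the experiments described here each call of `predict` would itself involve solving a random sample of SAT problems, so the chain is kept short by starting from a set for which the evaluation is cheap.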

The threshold generator was proposed by J. Bruer in [25] (see also [26]). This generator contains $R$, $R > 3$, LFSRs that are shifted simultaneously. At the initial step $\tau = 0$ the bits of the secret key are placed into the LFSRs. At each moment of time $\tau$, $\tau \in \{1, 2, \ldots\}$, the output bits of the LFSRs are used as arguments of the majority function. The value of the majority function is 1 if the majority of its input bits are 1, and 0 otherwise (see Fig. 2). The output bits of the majority function (one at each moment of time) form the keystream.
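A minimal Python sketch of such a generator is given below. The register layout is an assumption made for illustration (the last cell is the output cell, the feedback bit is the XOR of the tapped cells and enters at the first cell); the tap positions in the example are toy values, not the connection polynomials used in the experiments.

```python
def lfsr_step(state, taps):
    """Shift an LFSR once and return (output_bit, new_state).

    state -- list of bits; state[-1] is the output cell (assumed layout)
    taps  -- 0-based indices of the cells XORed to form the feedback bit
    """
    out = state[-1]
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return out, [feedback] + state[:-1]


def majority(bits):
    """Return 1 if more than half of the bits are 1, and 0 otherwise."""
    return 1 if sum(bits) > len(bits) / 2 else 0


def threshold_keystream(states, taps_list, n):
    """Generate n keystream bits from simultaneously shifted LFSRs."""
    states = [list(s) for s in states]
    stream = []
    for _ in range(n):
        outs = []
        for i, taps in enumerate(taps_list):
            out, states[i] = lfsr_step(states[i], taps)
            outs.append(out)
        stream.append(majority(outs))
    return stream


if __name__ == "__main__":
    toy_states = [[1, 0, 1], [0, 1, 1], [0, 0, 1]]
    toy_taps = [[0, 2], [0, 2], [0, 2]]
    print(threshold_keystream(toy_states, toy_taps, 3))  # [1, 0, 0]
```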

Parallel logical cryptanalysis was applied to 
the threshold generator based on five LFSRs 
given by the following connection polyno- 
mials: LFSR 1: X^^ + A:1" + X^ + X^ + 1; 
LFSR 2: X^^+X^'^ + X'^ + X + 1; LFSR 3: X^^ + 
^13 + x^ + X^ + l- LFSR 4: X^'^ + X*^ + X-^ + 
X^ + I- LFSR 5: X^^ + X^^ 



X'' + X' + 1.

Fig. 2. Threshold generator.

Thus, the length of the secret key in the considered generator was 80 bits. We analyzed the first 150 bits of the keystream. The initial decomposition set $\tilde{X}_1$ was formed by the variables encoding the contents of the first three LFSRs. Applying the predictive function technique resulted in the set $\tilde{X}^*$, which contains the variables encoding the full initial contents of the first two LFSRs plus one variable corresponding to the rightmost bit of the third LFSR. We generated 10 random tests in which "true random" sequences (see [27]) were used as secret keys. The results of cryptanalysis are presented in Table 1 (the correct secret keys were found in all tests). During the cryptanalysis of the threshold generator on the cluster, super-linear speedup was observed. In our opinion, this can be explained by the practical incompleteness of the SAT solvers we use (see Section 3).

The summation generator was proposed by R. Rueppel (see [28], [11]). In this generator, as in the threshold one, each bit of the keystream is the output of a nonlinear function. The function's inputs are the outputs of several simultaneously shifted LFSRs (see Fig. 3).

The only difference between this generator and the threshold one is the use of a special summation function (see [11]) instead of the majority function. The summation function uses a Carry register that is dynamically updated while the generator works. The secret key consists of the initial contents of the Carry register and the LFSRs.



Fig. 3. Summation generator.



At the initial step $\tau = 0$ the bits of the secret key are placed into the Carry register and the LFSRs. At each moment of time $\tau = 1, 2, \ldots$ the LFSRs simultaneously output the bits $z_j^\tau$, $j = 1, \ldots, R$, on the basis of which the following values are calculated:

$S_\tau = \sum_{i=1}^{R} z_i^\tau + C_{\tau-1}$, $C_\tau = \lfloor S_\tau / 2 \rfloor$.

The bit $(S_\tau \bmod 2)$ is output to the keystream, and the binary representation of the number $C_\tau$ is placed into the Carry register.
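The update of the Carry register can be sketched in Python as follows. The sketch assumes the usual definition of the summation generator (the integer sum of the LFSR output bits and the previous carry gives the keystream bit as its parity and the new carry as its integer half); the LFSR layout and the toy registers are illustrative assumptions, not the registers used in the experiments.

```python
def lfsr_step(state, taps):
    """Shift an LFSR once and return (output_bit, new_state) (assumed layout)."""
    out = state[-1]
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return out, [feedback] + state[:-1]


def summation_keystream(states, taps_list, carry, n):
    """Generate n keystream bits of a summation generator.

    carry -- initial content of the Carry register, as an integer
    """
    states = [list(s) for s in states]
    stream = []
    for _ in range(n):
        outs = []
        for i, taps in enumerate(taps_list):
            out, states[i] = lfsr_step(states[i], taps)
            outs.append(out)
        total = sum(outs) + carry  # sum of LFSR outputs plus previous carry
        stream.append(total % 2)   # keystream bit
        carry = total // 2         # new content of the Carry register
    return stream


if __name__ == "__main__":
    # Two one-cell registers that always output 1, initial carry 0.
    print(summation_keystream([[1], [1]], [[0], [0]], 0, 4))  # [0, 1, 1, 1]
```

With four LFSRs the carry never exceeds 3 once it starts at a value of at most 3, which agrees with a Carry register of two bits.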

Parallel logical cryptanalysis was applied to 
the summation generator based on the follow- 
ing four LFSRs: LFSR 1: X^^+X^ + X^+X + l; 



LFSR 2: X^^+X^ + X^ + X^ + l\ LFSR 3: X^^ + 
X^+X-^+X + l; LFSR 4: X^'' +X^+X^+X^ + 1. 

Thus, the two unknown bits of the initial contents of the Carry register and the 61 bits of the initial contents of the LFSRs together form a secret key of 63 bits. We analyzed the first 180 bits of the keystream. Using the predictive function technique we constructed a decomposition set $\tilde{X}^*$ formed by the variables encoding the initial contents of the first two LFSRs (28 variables). The results of parallel logical cryptanalysis of this generator are shown in Table 1.

The Gifford generator was developed by a 
group led by J. Gifford in 1984 (see [29]) and 
for quite a long time it was used in practice 
for the transmission of text information. The 
first successful attack on the Gifford generator 
is described in the article [30], where a very 
complicated mathematical apparatus specially 
developed for cryptanalysis of this generator 
was presented. A distinctive feature of the 
Gifford generator is that it does not use LFSRs. 

The algorithm of the generator processes information in groups of 8 bits. At the initial step $\tau = 0$ the bytes $b_1, \ldots, b_8$ of the secret key (64 bits in total) are written into the cells $B_1, B_2, \ldots, B_8$ (see Fig. 4). The keystream is a sequence of bytes $T_1, T_2, \ldots$ output by the generator at the moments of time $\tau \in \{1, 2, \ldots\}$. At each step the contents of the cells $B_1, B_2, \ldots, B_8$ are shifted by one byte to the right. At the same time the old content of the cell $B_8$ is discarded and the new content of the cell $B_1$ is calculated using the feedback function:



$b_1 \oplus (\gg_1(b_2)) \oplus (\ll_1(b_8))$.



The notation $\gg_1$ denotes a right shift by one bit that preserves the high-order bit (sticky right shift). The notation $\ll_1$ denotes a left shift by one bit with a zero shifted into the low-order bit (zero-fill left shift). That is, for $B = (x_1, x_2, \ldots, x_8)$ we have

$\gg_1(B) = (x_1, x_1, x_2, \ldots, x_7)$,

$\ll_1(B) = (x_2, x_3, \ldots, x_8, 0)$.

To calculate a byte of the keystream, a nonlinear output function $h : \{0, 1\}^{32} \to \{0, 1\}^8$ is used. This function takes four input bytes (the contents of the cells $B_1$, $B_3$, $B_5$, $B_8$) and outputs a single byte:

$h(B_1, B_3, B_5, B_8) = \mathrm{ExtractByte}_3((B_1 | B_3) \times (B_5 | B_8))$.

Here $|$ is the concatenation of the contents of the corresponding cells and $\times$ is integer multiplication. Thus, the argument of the function $\mathrm{ExtractByte}_3$ is a natural number whose binary representation is a 32-bit vector. The output of $\mathrm{ExtractByte}_3(x)$ is the third byte from the left of this vector.
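The byte operations of the Gifford generator can be illustrated with the Python sketch below. Bytes are stored as integers with the first bit of the cell as the high-order bit. The shifts and the output function follow the definitions in the text; the feedback function uses the first, second and eighth cells, which is the usual description of this generator and should be treated as an assumption here. The function names are hypothetical.

```python
def sticky_right_shift(b):
    """Right shift by one bit, keeping the high-order bit in place."""
    return (b >> 1) | (b & 0x80)


def zero_fill_left_shift(b):
    """Left shift by one bit, shifting a zero into the low-order bit."""
    return (b << 1) & 0xFF


def extract_byte3(x):
    """Return the third byte from the left of a 32-bit value."""
    return (x >> 8) & 0xFF


def output_byte(cells):
    """Output function applied to cells 1, 3, 5 and 8 (cells[0] is cell 1)."""
    left = (cells[0] << 8) | cells[2]   # concatenation of cells 1 and 3
    right = (cells[4] << 8) | cells[7]  # concatenation of cells 5 and 8
    return extract_byte3(left * right)


def feedback(cells):
    """Feedback byte; the cell indices are assumed (cells 1, 2 and 8)."""
    return cells[0] ^ sticky_right_shift(cells[1]) ^ zero_fill_left_shift(cells[7])


def step(cells):
    """Shift the register one byte to the right and insert the feedback byte."""
    return [feedback(cells)] + cells[:-1]


if __name__ == "__main__":
    register = [0x0F, 0x80, 0, 0, 0, 0, 0, 0x81]
    print([hex(b) for b in step(register)])
```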



Fig. 4. Gifford generator.

TABLE 1
Results of cryptanalysis (10 tests for each generator)

Generator, secret key length /                   Time of parallel solving on 80 cores        Time of solving
analyzed fragment of the keystream               of the Blackford cluster                    on a single core
                                                 Min            Max            Aver
Threshold, 5 LFSR, 80 bits / first 150 bits      10 min.        46 min.        27 min.       > 2 days (interrupted)
Summation, 4 LFSR, 63 bits / first 180 bits      1 h. 14 min.   7 h. 43 min.   3 h. 21 min.  > 2 days (interrupted)
Gifford, 64 bits / first 160 bits                7 min.         9 h. 53 min.   4 h. 57 min.  for some tests less than 1 day



The structure of decomposition sets for the Gifford generator is completely different from the structure of these sets for the generators that use LFSRs. Note that the initial content of an LFSR defines all of its subsequent states. The structure of decomposition sets in the problems of cryptanalysis of the threshold and summation generators is determined by this very fact. Since the Gifford generator does not use LFSRs, the following simple strategy for constructing the initial decomposition set proved to be the most efficient: the set $\tilde{X}_1$ consisted of the variables encoding the values of the first 32 bits of the secret key. Then the value of the predictive function was improved by a method similar to that considered above: each new decomposition set was obtained from the previous one by removing some variable. We analyzed the first 160 bits of the keystream.

In the end, the predictive function technique applied to the Gifford generator gave the following result: if the number of computing cores of a cluster is $2^k$, then one needs to parallelize the SAT problem in the $k$ variables corresponding to the first $k$ bits of the secret key. Therefore, in the case of the Gifford generator the power of the decomposition family $\Delta_{\tilde{X}}(C)$ coincides with the number of computing cores. For our experiments we had 80 cores available on the cluster. Since 80 is not a power of 2, the original SAT problem was parallelized in 6 variables, which resulted in 64 tasks. To load the remaining cores we divided 15 of the initial 64 SAT problems into two subtasks each, thus obtaining 79 tasks. The one remaining core performed the controlling functions.
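The count of 79 tasks can be reproduced with a short Python sketch. The text does not say which 15 tasks were split or by which variable; the sketch assumes, purely for illustration, that the first 15 tasks are split by fixing the seventh key bit.

```python
from itertools import product


def gifford_tasks(k=6, split=15):
    """Return partial assignments of the first key bits, one tuple per task.

    Assumption: the first `split` tasks are divided in two by fixing one
    additional key bit; the source does not specify this choice.
    """
    tasks = []
    for index, prefix in enumerate(product((0, 1), repeat=k)):
        if index < split:
            tasks.append(prefix + (0,))
            tasks.append(prefix + (1,))
        else:
            tasks.append(prefix)
    return tasks


if __name__ == "__main__":
    tasks = gifford_tasks()
    print(len(tasks))  # 79 = 2 * 15 + (64 - 15)
```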
Fig. 5. Fragment of the process of optimization of a predictive function for the threshold generator.

Fig. 5 gives an example of a graph displaying the process of optimization of the predictive function for the cryptanalysis of the threshold generator. The white columns indicate that, for the corresponding variants of the decomposition set, the calculation of the predictive function was interrupted because the current threshold value was exceeded (see Section 3).

5 Cryptanalysis of the generator A5/1 in a Grid system

5.1 Construction of a decomposition set 

The problem of constructing a good decomposition set for the parallel cryptanalysis of A5/1 proved to be quite nontrivial. We investigated different ways of constructing a decomposition set. First, using some "reasonable" assumptions, we constructed a basic decomposition set $X'$. Then we tried to reduce it by applying the predictive function technique.

We propose to include into the decomposition set $X'$ the variables encoding the initial states of the register cells, from the first cell of each register up to and including the cell containing the clocking bit (the corresponding cells in Fig. 6 are dark shaded). Thus, the decomposition set $X'$ consists of 31 variables:

$X' = \{x_1, \ldots, x_9, x_{20}, \ldots, x_{30}, x_{42}, \ldots, x_{52}\}$ (5)
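The index set in (5) can be recomputed from the register lengths with a short Python sketch. The register lengths (19, 22 and 23 cells) follow from the variable numbering used in this section; the positions of the clocking cells (the 9th, 11th and 11th cell of the respective register) are inferred from (5) and agree with the usual description of A5/1. The names are hypothetical.

```python
REGISTER_LENGTHS = (19, 22, 23)
CLOCKING_CELL = (9, 11, 11)  # 1-based position of the clocking cell in each register


def basic_decomposition_set():
    """Return the variable indices from the first cell up to the clocking cell."""
    variables = []
    offset = 0
    for length, clock in zip(REGISTER_LENGTHS, CLOCKING_CELL):
        variables.extend(range(offset + 1, offset + clock + 1))
        offset += length
    return variables


if __name__ == "__main__":
    chosen = basic_decomposition_set()
    print(len(chosen), chosen)
```

The three blocks contain 9, 11 and 11 variables, which gives the 31 variables of the set.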



This choice is motivated by the following considerations. By assigning values to all variables from $X'$ we determine the exact values of the clocking bits for a large number of subsequent states of all three registers. These clocking bits are the most informative because they determine the value of the majority function. The fact that we could not further reduce the set $X'$ (5) by applying the predictive function technique was quite unexpected. In our statistical experiments, for each variant of the decomposition set and each initial fragment of the keystream of some fixed length, we constructed a random sample of 1000 CNFs. For each such sample we calculated the value of the predictive function (see Section 3). Table 3 shows the values of the predictive function calculated for different variants of decomposition sets and keystream lengths. All of these variants of decomposition sets are, nevertheless, conceptually similar to (5). For example, the decomposition set consisting of 30 variables was formed according to the scheme in Fig. 7.

Fig. 6. Scheme of a decomposition set consisting of 31 variables.

In all computational experiments SAT problems were solved using a modified variant of the SAT solver MiniSat-C v1.14.1 (the details of the modification are described in Section 4). A single core of an Intel E8400 processor with 2 GB of RAM was used as the test platform.



Fig. 7. Scheme of a decomposition set consisting of 30 variables.

Next, we estimate the time required to solve the cryptanalysis problem for the generator A5/1 on the computing cluster SKIF MSU "Chebyshev" (see [31]), which consists of 1250 quad-core Intel E5472 processors (the peak performance of the cluster is 60 Tflop/s). Table 2 compares the characteristics of the Intel E5472 and Intel E8400 processors.

From this table we can see that the cores of the Intel E8400 and the Intel E5472 are comparable in power (there is only a slight difference in the bus frequency). We therefore think it is reasonable to use the results of the numerical experiments shown in Table 3 to estimate the time required for solving the problem of logical cryptanalysis of the generator A5/1 on the cluster SKIF MSU "Chebyshev". In addition, note that the job batch transfer technique described in the previous section makes the impact of transfer costs on the overall computation time negligible.

TABLE 2
Characteristics of processors

Processor model    Intel E8400    Intel E5472
Number of cores    2              4
Core frequency     3.0 GHz        3.0 GHz
Bus frequency      1333 MHz       1600 MHz
Cache L2           6 MB           12 MB

TABLE 3
Values of the predictive function for a single core of the processor Intel E8400 for the generator A5/1 cryptanalysis (in hundreds of millions of seconds)

Power of the          Length of keystream
decomposition set     128     144     160     176     192
29                    3.87    3.80    3.69    3.95    3.76
30                    3.65    3.59    3.59    3.71    3.83
31                    3.76    3.55    3.71    3.73    3.81
32                    4.23    4.15    4.27    4.39    4.32
33                    4.70    4.87    4.89    4.95    5.23

Based on the above, we estimate the computing time of logical cryptanalysis of A5/1 on the "Chebyshev" cluster at 9-12 hours on average. The corresponding parameters of the decomposition are as follows: the power of the decomposition set is 31 variables (Fig. 6), and the number of analyzed bits of the keystream is 144 (the first bits of the stream).
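A rough plausibility check of this estimate can be written as a few lines of Python. It is not necessarily how the estimate was derived: the sketch assumes ideal linear scaling over 5000 cores (1250 quad-core processors), takes the predictive function value for 31 variables and 144 keystream bits from Table 3 as the single-core time for the whole decomposition family, and assumes that on average the key is found after half of the family has been processed.

```python
def expected_hours(predictive_value, cores, fraction_processed=0.5):
    """Estimate wall-clock hours from a predictive function value.

    predictive_value   -- single-core time in units of 10^8 seconds (Table 3)
    cores              -- number of cores, assuming ideal linear scaling
    fraction_processed -- assumed average share of the family that is
                          processed before the key is found
    """
    total_core_seconds = predictive_value * 1e8
    wall_seconds = total_core_seconds / cores * fraction_processed
    return wall_seconds / 3600


if __name__ == "__main__":
    print(round(expected_hours(3.55, 5000), 1))  # about 9.9 hours
```

Under these assumptions the result falls within the range of 9-12 hours stated above.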

5.2 The necessity of application of distributed computing technologies to the problem of cryptanalysis of the generator A5/1

The predicted time for solving this problem on the supercomputer "Chebyshev" shows that even if the cluster is fully dedicated to this task, the solving process requires considerable time. Exclusive use of public access multiprocessor computing complexes is usually not possible. At the same time, researchers often have resources of various clusters, Grid systems and high-performance servers at their disposal. The software complex BNB-Grid [32] makes it possible to use such heterogeneous distributed computing resources (called computing nodes) for solving complex computational problems. It has already shown high efficiency in application to several large-scale optimization problems [33].

The structure of the computational algo- 
rithm for solving the problem of cryptanalysis 
of the generator A5/1 allows an efficient im- 
plementation in parallel and distributed sys- 
tems. This is possible because subtasks of the 
decomposition family can be processed inde- 
pendently. Next, we describe the process of 
solving the problem of cryptanalysis of A5/1 
in the BNB-Grid which uses computational 
resources of several multiprocessor systems. 



5.3 Computing complex BNB-Grid 

The hierarchical structure of BNB-Grid is shown in Fig. 8. The object Computing Space Manager (CS-Manager) is located at the top level. It decomposes the original problem into subproblems and distributes them among the computing nodes. For each computing node there is a corresponding object of the type Computing Element Manager (CE-Manager). The CE-Manager provides communication between the CS-Manager and the corresponding computing node and also starts and stops applications on this node. After receiving a task from the CS-Manager, the CE-Manager transfers it to the corresponding node and starts the MPI application BNB-Solver, which processes the received task on all available cores.

The core of the system is implemented in the Java programming language using the middleware Internet Communications Engine (ICE) [34], an analog of CORBA. The graphical user interface is also implemented as an ICE object, indicated in the figure as GUI Manager. ICE objects are located either on one or on several computers within a local network.

Fig. 8. Organization of computations in BNB-Grid.

Several copies of the BNB-Solver application can be started on one computing node. This approach often proves reasonable for shared access supercomputers working under the control of batch processing systems. In such systems an application requesting a large number of processors can be queued for a long time, waiting for an appropriate "window" in the task schedule. At the same time, an application requesting a significantly smaller number of processors can be launched earlier.

5.4 Parallel solving of SAT problems in BNB-Grid

A module for processing SAT problems on a computing cluster was added to the BNB-Solver. The input data of the control object CS-Manager is a description of the original SAT problem in XML format. The CS-Manager decomposes the SAT problem for the original CNF $C$ and obtains the decomposition family $\Delta_{\tilde{X}^*}(C)$. The CE-Managers transfer job batches, constructed according to the technique described in Section 4, to the computing nodes.

For better efficiency of the solving process, a reasonable compromise between the number of job batches and the number of tasks in a batch should be achieved. A large number of tasks in a batch allows us to reduce the idle time of the processors. However, when the number of tasks in a batch is too large, its processing time increases greatly. This is undesirable because an application that requests too many resources of a cluster may be queued for a long time or may not be run at all.

For solving of the problem of logical crypt- 
analysis of generator A5/1 in the BNB-Grid 
a following decomposition was used. The de- 
composition family itself consisted of 2^^ SAT 
problems. It was split into 2^^ disjoint subsets 
(job batches) with 2^'^ SAT problems in each 
of them. These job batches were processed by 
BNB-solvers. 



5.5 Computational experiments 

In our experiments three test problems of cryptanalysis of the generator A5/1 were solved. The computations were carried out on four computing clusters, whose characteristics (see Table 4) are taken from the 11th edition of the list of the most powerful supercomputers in the CIS [35].

The number of simultaneously working computing cores varied in the process of calculations from 0 to 5568, averaging approximately 2-3 thousand cores. For each test, computations were stopped after the first satisfying assignment was found. The first test problem was solved (the secret key of the generator was found) in 56 hours, and the second and the third in 25 and 122 hours of Grid system work, respectively.

The problem of cryptanalysis of the generator A5/1 is also interesting because the same keystream of arbitrary length can be generated from different secret keys. This fact was noted by J. Golic in [36]. We call these situations "collisions", using the evident analogy with the corresponding notion from the theory of hash functions. The approach presented in this article allows us to solve the problem of finding all the collisions of the generator A5/1 for a given fragment of a keystream. Using BNB-Grid we found all collisions for one test problem (we analyzed the first 144 bits of the keystream). It turned out that there are only three such collisions (see Table 5). Processing this test problem took 16 days of Grid system work.
TABLE 4
Characteristics of computing clusters

Name                  Institution                          Processors            Number of cores
MVS-100K              Joint Supercomputer Center of RAS    Xeon E5450 3 GHz      7920
SKIF-MSU Chebyshev    Moscow State University              Xeon E5472 3 GHz      5000
Cluster of RRC        RRC Kurchatov Institute              Xeon 5345 2.33 GHz    3456
BlueGene P            Moscow State University              Power PC 850 MHz      8192

TABLE 5
Original key and collisions of generator A5/1 (in hexadecimal format)

             LFSR 1                   LFSR 2                      LFSR 3
             $x_1, \ldots, x_{19}$    $x_{20}, \ldots, x_{41}$    $x_{42}, \ldots, x_{64}$
orig. key    2C1A7                    3D35B9                      EEAF2
collision    2C1A7                    3E9ADC                      EEAF2
collision    2C1A7                    3D35B9                      77579

6 Conclusion

In this work a parallel technology for solving inversion problems for discrete functions computable in polynomial time is presented. This technology is based on a reduction of the considered problems to SAT problems. Using the information about the input variables of the considered function, we construct a decomposition of the corresponding SAT problem into a family of subproblems. Then this family is processed as a parallel task list in a distributed computing environment. To construct a good decomposition we use the technique of optimizing a special predictive function.

The technology presented in this work was tested on problems of cryptanalysis of some keystream generators (the threshold, summation and Gifford generators). These problems were solved on a low-performance computing cluster. For solving the problem of cryptanalysis of the keystream generator A5/1 a Grid system was specially constructed. Although this task was quite time-consuming (from 1 to 16 days), all the tests were solved correctly. Thus, the possibility of cryptanalysis of A5/1 in public access computing environments without the use of special computing architectures (as, for example, in [37]) was experimentally confirmed.

An implementation of the described technology in more powerful Grid systems or on high-performance clusters makes the problem of cryptanalysis of A5/1 solvable within several hours. We particularly emphasize that our approach makes it possible to find a secret key using only a very small fragment of the keystream (the first 144 bits). This result can be considered as one of many votes against the widespread use of the cipher A5/1. CNFs encoding the problem of cryptanalysis of the generator A5/1 are available on the webpage [38].

Let us emphasize that the main purpose of the present work was the development of a technology for solving inversion problems for polynomially computable discrete functions in distributed computing environments. The results of cryptanalysis presented in the article were not a goal in themselves and should be considered only as arguments for the efficiency of the proposed technology.